Teacher evaluation systems introduced by states and school systems in the past several
years have focused attention on improving the performance of public school teachers, but
they have been cost- and time-intensive, placing a significant burden on states’ and
districts’ resources. In Tennessee, for example, trained evaluators conducted nearly
300,000 classroom observations during the 2011-2012 school year, prompting administrators
to complain that “the amount of time spent to implement TEAM [the state’s new system] was
unmanageable.” 1

Even among school leaders who did not 
feel overburdened by the number of hours 
spent observing classrooms, many felt the 
time had not been used efficiently, as the 
system treated all teachers with the same 
intensity, despite the fact that teachers’ 
skills and needs for support varied widely. 

To address these concerns, Tennessee and other systems have replaced their
one-size-fits-all evaluation approaches with more differentiated models, using past
performance data to determine which teachers should be evaluated with more or less
intensity in subsequent evaluation cycles and, in some cases, what that attention
should include. This brief explores differentiation strategies in nine districts, two
charter management organizations, and three states: Tennessee, Delaware, and Ohio. 2
Interviews with system leaders and analyses of teacher evaluation policies reveal that
these systems now vary the format or frequency of formal evaluation cycles, the format
or frequency of classroom observations, or the type of observer conducting classroom
observations, based on what is known about teachers’ needs, strengths, and goals. 3
Many of these school systems have embraced differentiation strategies as a way to
conserve teacher evaluation resources or to deploy existing resources more efficiently.
In some of the systems, however, differentiation strategies have required increased
resources, as system leaders have introduced more frequent classroom observation or
trained additional observers (e.g., peers, coaches). But even in those instances,
officials claim differentiation strategies have helped make evaluation systems more
attentive to teachers’ individual needs for supervision and support. And that, they
say, is likely to lead to more effective teacher evaluation systems.

Common Strategies for Differentiating Teacher Evaluations

Flexibility for Top Performers 

Some school systems have sought to shrink principals’ workloads by reducing the amount
of time they spend evaluating top-performing teachers. In Ohio, for example, districts
adopting the state’s model evaluation system can opt to evaluate Accomplished teachers
every other year instead of annually, the requirement for teachers rated Proficient or
below. 4 Delaware’s state system allows for a similar approach, although, unlike in
Ohio, top-performers are not removed entirely from the evaluation cycle; they receive a
“student improvement score” and at least one observation annually, but a full
performance evaluation is conducted only once every two years, thereby reducing
evaluators’ administrative workloads.

Though these approaches can save time, critics worry 
that such policies may discourage top-performers by 
limiting their opportunities for feedback and professional
growth. 5 To ensure top-performers are still engaged
in such opportunities, Burlington, VT; Providence,
RI; and Pittsburgh, PA, allow their strongest
teachers to complete alternative projects (action re- 
search, self-directed study, etc.) in lieu of traditional 
evaluations. In all three cases, teachers receive annual 
performance evaluations based on their progress to- 
ward specific project goals or outcomes, but the pro- 
cess is still generally less resource-intensive for school 
administrators, who draw heavily on evidence from 
teachers’ self-reflections and peers’ assessments of 
their progress to determine final evaluative ratings. 
For these three districts, alternative cycles are also a 
way to reward top-performers with greater autonomy 
to design and monitor their own professional growth. 

New Formats, Tailored Frequency 

Even in school systems that require traditional perfor- 
mance evaluations for all teachers annually, officials 
now vary how and how often they observe teachers 
based on past performance data. 
Increasingly, districts are experimenting with different
formats for classroom observation, recognizing
that traditional, full-length observations may not be 
the best use of observers’ time, or the best way to 
gather data about teachers’ performance. Delaware, 
for example, now requires districts to use combina- 
tions of announced and unannounced classroom 
observations. Tenured teachers rated Highly Effective 
or Effective participate in one announced observa- 
tion each year, while their lower-performing peers 
must undergo at least one additional unannounced 
observation, which provides evaluators a chance to 
“watch a teacher in action without providing prior 
notice.” 7 The RISE system in Pittsburgh and Tennessee’s
TEAM system also use combinations of announced
and unannounced observations, depending
on a teacher’s past performance. 

Several districts also use walkthroughs — informal 
check-in visits during which observers spend a brief 
amount of time in a teacher’s classroom focusing on a 
specific skill or behavior, rather than an entire domain 
of the observation rubric (as would occur during a 
full-length observation). This shortened format has 
become a popular way for observers to gather more 
frequent and more focused evidence of teacher prac- 
tice, making it easier for them to provide regular, tar- 
geted feedback to teachers on specific areas of need — 
a task that’s difficult to do well when an observer must 
evaluate several areas of a teacher’s practice simultane- 
ously. 

The charter school network Achievement First, convinced these shortened observations
are the key to driving improvement in teacher practice, has reduced the number of
required formal observations to create more time for these shorter, growth-oriented
conversations to occur, especially for teachers who need more frequent support. Many
other early adopters also note that because walkthroughs are shorter and do not require
careful scheduling (most are unannounced), they allow for more frequent interaction
between teachers and their observers, a feature that helps build trust and comfort with
the process.

Called by different names in different places (e.g., par- 
tials, rounds, etc.), versions of these narrowly focused, 
abbreviated observations have gained popularity in 
Denver, Providence, Pittsburgh, at the charter net- 
work DC Prep, and elsewhere. Though many school 
systems label these observations informal, it’s worth
pointing out that informal does not always indicate
that these observations are entirely disconnected from 
teachers’ formal performance evaluations. To the con- 
trary, many districts take input from those who con- 
duct informal observations into account in teachers’ 
formal evaluations, though how much weight they 
carry varies considerably (and is often not formally 
defined). 

To take advantage of these new observation formats and to use observers’ time more
strategically, districts and states have also rethought how often they observe
teachers. In Tennessee, for example, the state board of education recently modified
requirements for the number of observations teachers will undergo under the state’s
TEAM evaluation system, differentiating the frequency based on a teacher’s licensure
status and his or her last overall evaluation score.

This change, driven by feedback from teachers and administrators after the first year
of statewide TEAM implementation, is intended to allow evaluators to “spend more time
with the teachers most in need of improvement, while reducing the amount of time spent
with teachers whose student outcomes demonstrate strong performance.” 6 Teachers in
Tennessee now undergo one, two, three, or four full-length observations depending on
their prior performance (a composite score of value-added measures and observation
scores) and licensure status.

A similar system exists in Hillsborough County, FL, where observers conduct 11
observations of the district’s lowest-performing teachers, but only three for its
highest-performers. And, in Providence, a new Peer Assistance and Review program will
provide at least 15 observations and coaching sessions to the district’s Ineffective or
Developing teachers, in addition to the observation and support they receive through
the district’s standard evaluation process. 7

Other school systems set a minimum number of formal observations, but allow building
administrators to conduct additional observations, whether formal or informal,
according to teachers’ demonstrated needs and the school’s available resources. In
Jefferson County, CO, for example, teachers must be formally observed three times
annually, but schools’ Instructional Leadership Teams can opt to conduct between four
and ten additional observations based on their assessment of a teacher’s performance. 8
Seattle, Denver, Providence, Burlington, Achievement First, DC Prep, and Pittsburgh
encourage similar case-by-case differentiation.

New Sets of Eyes 

To distribute the responsibility for classroom ob- 
servations and other evaluation duties, many states 
and districts now extend observation responsibilities 
beyond administrators to include high-performing 
peers, coaches, and other instructional leaders in the 
evaluation process. 9 Doing so not only reduces princi- 
pal workloads, but can also provide specialized exper- 
tise in particular disciplines (e.g., content, grade-level, 
or ELL expertise). Using data from prior evaluations, 
school systems now deploy these observers more stra- 
tegically, using different combinations of observers 
for teachers with different needs. 




In Hillsborough County, teachers undergo a specific 
combination of administrator, peer, and supervisor 
observations, depending on their evaluation ratings 
from the previous year. While most teachers in 
Hillsborough are observed by some combination of 
administrators and peers, those rated “unsatisfactory” 
undergo additional supervision by content supervisors. 


CARNEGIE FOUNDATION FOR THE ADVANCEMENT OF TEACHING 




EVALUATING TEACHERS MORE STRATEGICALLY 




New Haven mandates a similar combination system, 
relying on what the district calls “Third Party Vali- 
dators” (TPVs) to conduct observations of teachers 
rated at the high or low extremes of the district’s scale 
by their supervisors; TPVs conduct three extra obser- 
vations for teachers on track to receive Needs Improve- 
ment ratings and two for those projected to be rated 
Exemplary. By employing these trained external ob- 
servers, New Haven validates the ‘extreme’ scores and 
introduces additional objectivity to the process — all 
without creating more work for principals and other 
instructional leaders. 

More commonly, policy dictates a minimum num- 
ber of observations teachers must receive, but pro- 
vides flexibility for school leaders to determine who 
conducts those observations based on a teacher’s
past performance and content area (or grade level).
In Pittsburgh, for example, administrators and mas- 
ter teachers called Instructional Teacher Leader 2s 
(ITL2s) share caseloads of “high touch” teachers (pre- 
tenure and low performers) and “low touch” teachers 
(average and high-performers with tenure), with the 
ITL2s taking on more “low touch” cases. 10 Similar
discretionary differentiation can be found in Maricopa
County, AZ; Jefferson County; Denver; and Providence.
Ohio’s Teacher Evaluation System (OTES),
which also gives school leaders this discretion, goes 
one step further, giving teachers with above average 
student growth scores the option to select their own 
“credentialed evaluator” for formal observations. 11 

In some cases, alternative observers receive extra compensation (stipends and/or
release time) and intensive, ongoing training in observation, evaluation, or coaching
techniques. Some districts also use external contractors to observe teachers or to
score videos of teaching. 12 Though none of the school systems could provide Carnegie
cost estimates, 13 several reported that adding alternative observers had been
expensive but, they believe, ultimately effective in providing more supervision and
support for teachers.

Realistically, many districts do not have the capacity 
to build and maintain cadres of alternative evaluators, 
particularly given how many are struggling to provide 
sufficient training to principals and other traditional 
evaluators. And little is known about exactly how to 
select, train, and deploy these alternative evaluators 
most effectively or most efficiently. But given that 
the Gates Foundation’s MET project concluded that, 
“adding a second trained observer increases reliabil- 
ity significantly more than having the same observer 
score an additional lesson,” it seems likely that the use 
of multiple observers will become more common. 14 


Though it is too soon to quantify the impact these dif- 
ferentiation strategies have had on teacher evaluation 
systems (most have been in place for less than three 
years), evidence from the education systems in this 
study suggests that they have helped districts deploy 
their resources more strategically. Even where system 
leaders have added new components (e.g., training for 
new types of observers) to their evaluation systems, 
they report that their investment in more adaptable 
evaluation systems has allowed them to better match 
teachers with the supervision and support they need. 
| | Delaware (state) | Ohio (state) | Tennessee (state) | Burlington, VT | Denver, CO | Hillsborough County, FL |
|---|---|---|---|---|---|---|
| SYSTEM DATA 1 | | | | | | |
| Students | 131,029 | 1.87 million | 934,000 | 3,362 | 78,339 | 194,525 |
| Schools | 216 | 3,305 | 1,728 | 10 | 158 | 305 |
| Teachers | 8,594 | 110,000 | 64,227 | 343 | 4,681 | 13,469 |
| EVALUATION SYSTEM | | | | | | |
| System or Framework Title | DPAS II | Ohio Teacher Evaluation System (OTES) | TEAM | Differentiated Teacher Supervision and Evaluation System | LEAP | Empowering Effective Teachers |
| System-wide Implementation Date | 2008-2009 | 2013 | 2011-2012 | 2007-2008 | 2012-2013 | 2010-2011 |
| Components of Final Score | Matrix of 5 Components: 4 related to classroom practice, 1 related to student improvement | 50% Teacher Performance + 50% Student Performance | 35% Student Growth + 15% Academic Achievement + 50% Observation | No Numerical Scores Provided. Qualitative feedback based on observation and conversations between teachers & administrators | 30% Observation + 10% Professionalism + 10% Student Perception Data + 50% Student Achievement Data | 30% Principal Appraisal + 30% Peer/Mentor Appraisal + 40% Student Achievement Gains |
| Minimum Number of Formal Observations Required Annually | 1 to 3 | 2 | 1 to 4 | 0 to 4 | 2 to 4 | 3 to 11 |
| Observation Formats Used | Formal Announced & Unannounced | 30 min. Formal & Informal Walkthroughs | Formal Announced & Unannounced, Walkthroughs | Formal and Informal | 45-60 min. Formal, 20 min. Partials, 10 min. Walkthroughs, Peer | Admin. Formal & Informal, Peer Formal & Informal, Supervisor Formal |
| Effectiveness Levels | 4 | 4 | 5 | 3 | 4 | 4 |
| WHO CONDUCTS OBSERVATIONS? | | | | | | |
| Administrators | Yes | Yes | Yes | Yes | Yes | Yes |
| Alternative Observers (Peers, Mentors, Coaches, Third Party, etc.) | Yes | Yes | District discretion | Yes | Yes | Yes |
| BASED ON PRIOR EVALUATION RESULTS, DOES THE SYSTEM DIFFERENTIATE... | | | | | | |
| Frequency or type of evaluation cycle? | Yes | Yes | No | Yes | No | No |
| Frequency or type of mandatory classroom observations? | Yes | No 8 | Yes | Yes | No | Yes |
| Observer Type? | No | District discretion | District discretion | Yes | No | Yes |


1 System data from NCES (2010-2011) or from CMO leadership.

2 Data in this table describe a pilot program in Jefferson County, CO. The pilot, funded by TIF, impacts the number of students, schools, and teachers in parentheses.

3 The REIL project directed by the Maricopa County Education Service Agency includes six districts (Alhambra, Gila Bend, Isaac, Nadaburg, Phoenix Union, and Tolleson) in greater Phoenix, AZ.

4 Data for Providence, Rhode Island represent policies set by the Rhode Island Innovation Consortium, of which Providence is a member. Some of these policies have been provisionally approved but have not yet been enacted. For more information, see the profile for the district.

5 In Pittsburgh, high-performing teachers doing Supported Growth Projects in lieu of the formal RISE process are not observed formally while in a project year.

6 Tenured teachers in Providence who receive a Highly Effective Rating in their PPG&R domain can participate in the district's Differentiated Model of evaluation, which requires two informal observations but no formal observation.

7 Seattle's STAR program uses veteran teachers as mentors for novice teachers, but their observations only supplement those conducted by an administrator; they are not formal observations.

8 These districts set minimum requirements for annual formal evaluations but allow school-level leadership to conduct additional observations at their discretion. The frequency and format of these additional observations vary across districts.
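
The percentage weights listed under Components of Final Score in the table above can be read as a weighted average of component scores. The following is a minimal sketch of that arithmetic in Python, using the Tennessee (TEAM) weights from the table. The common 1-to-5 component scale, the function name, and the sample scores are assumptions made for illustration; none of them comes from a state's published scoring rules.

```python
# Weights for Tennessee's TEAM system, taken from the table above.
TEAM_WEIGHTS = {
    "student_growth": 0.35,
    "academic_achievement": 0.15,
    "observation": 0.50,
}


def composite_score(component_scores, weights):
    """Return the weighted average of a teacher's component scores.

    Assumption: every component is reported on the same scale
    (hypothetically 1 to 5) and the weights sum to 1.
    """
    if set(component_scores) != set(weights):
        raise ValueError("scores and weights must name the same components")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * component_scores[name] for name in weights)


# Hypothetical teacher; these scores are invented for illustration only.
example = {"student_growth": 4, "academic_achievement": 3, "observation": 5}
print(round(composite_score(example, TEAM_WEIGHTS), 2))
```

Because observation carries half the weight in this scheme, a one-point change in the observation score moves the composite by half a point, while the same change in academic achievement moves it by only 0.15.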
ENDNOTES


1 Tennessee Board of Education. (2012). Teacher 
Evaluation in Tennessee: A Report on Year 1 Imple- 
mentation: 20. 

2 This brief considers the model evaluation systems 
created by the state departments of education in Ten- 
nessee (TEAM) and Ohio, but recognizes that other 
district-specific systems are in place in those states. 
Delaware’s DPAS II system is mandatory for all districts.

3 Many districts also provide additional support for 
novice teachers, regardless of their evaluation results. 
Though these programs are promising strategies for 
improving practice and retaining teachers, this brief 
does not include such programs because they tend to 
differentiate based on experience rather than perfor- 
mance. Additionally, most are supportive, rather than 
evaluative in nature. 

4 Accomplished and Proficient are rating categories 
used by the Ohio Department of Education. Here
and throughout the brief, systems’ terminology will be 
used when describing rating categories. 

5 Ohio’s state model allows for alternative evaluation 
formats for top-performers. It is not clear how many 
districts have chosen or will choose to introduce such 
formats. 

6 Delaware Department of Education. (2013). DPAS 
II: Revised Guide for Teachers: 50. 

7 Providence’s PAR program provides similarly in- 
tensive support to the district’s novice teachers and 
to veteran teachers new to the district. The program’s 
purpose is to provide support to improve teachers’ 
practice, rather than to provide more opportunities for 
evaluative observation. Seattle’s STAR program pro- 
vides similarly intense, growth-oriented mentoring for 
novice teachers as well. 


8 This policy applies only to the 20 schools participat- 
ing in the Jefferson County Public Schools’ Teacher 
Incentive Fund Pilot project. 

9 For more on the impact of multiple observers, see 
Ho, A.D. and Kane, T.J. (2013). The Reliability of
Classroom Observations by School Personnel. Bill & 
Melinda Gates Foundation. 

10 ITL2s cannot issue final formal evaluations of 
teachers, however, even if their observations are a part 
of that score. In Pittsburgh and many other school sys- 
tems throughout the country, only administrators may 
issue summative ratings. 

11 Teachers with average student growth scores can 
“have input” in selecting their evaluator; those with 
below average scores have no say and are assigned an 
evaluator. 

12 Teachscape provided this service to the Measures 
of Effective Teaching project and serves as a clear example
of how this might work. See McClellan et al.
(2012).

13 The Carnegie Foundation has developed an online 
cost calculator to help district employees and mem- 
bers of the K-12 community understand the different 
components of designing a district’s teacher evaluation
system. For more information, please visit:
http://commons.carnegiefoundation.org/what-we-are-learning/2013/carnegie-cost-calculator-a-tool-for-exploring-the-cost-of-educator-evaluation-systems/

14 The Bill & Melinda Gates Foundation. (2013). “Ensuring
Fair and Reliable Measures of Effective Teaching:
Culminating Findings from the MET Project’s
Three-Year Study”: 5.